Journal of Classification 5:249-282 (1988) 



A Maximum Likelihood Methodology for Clusterwise 
Linear Regression 

Wayne S. DeSarbo William Cron 

University of Michigan Southern Methodist University 



Abstract: This paper presents a conditional mixture, maximum likelihood 
methodology for performing clusterwise linear regression. This new methodology 
simultaneously estimates separate regression functions and membership in clusters 
or groups. A review of related procedures is discussed with an associated critique. 
The conditional mixture, maximum likelihood methodology is introduced together 
with the E-M algorithm utilized for parameter estimation. A Monte Carlo analysis 
is performed via a fractional factorial design to examine the performance of the 
procedure. Next, a marketing application is presented concerning the evaluations 
of trade show performance by senior marketing executives. Finally, other potential 
applications and directions for future research are identified. 

Keywords: Cluster analysis; Multiple regression; Maximum likelihood estimation; 
E-M algorithm; Marketing trade shows. 



1. Introduction 

Ordinary least-squares (OLS), or multiple regression, has been frequently 
utilized in social science research to summarize the relationship 



We wish to thank the editor and three anonymous reviewers for their insightful 
comments which helped to improve this manuscript. 

Authors' Addresses: Wayne S. DeSarbo, Departments of Marketing and Statistics, 
Business School of the University of Michigan, Ann Arbor, MI 48104, USA, and William L. Cron, 
Department of Marketing, Edwin L. Cox School of Business, Southern Methodist University, 
Dallas, TX 75275, USA. 






between a predesignated set of independent variables and a single dependent 
variable. Let: 

$i = 1, \ldots, I$ subjects or observations; 

$j = 1, \ldots, J$ independent variables; 

$y_i$ = the value of the dependent variable for subject/observation $i$; 

$X_{ij}$ = the value of the $j$-th independent variable for subject/observation $i$; 

$b_j$ = the $j$-th OLS regression coefficient; 

$e_i$ = error for subject/observation $i$. 

Then, the standard OLS linear regression model can be expressed as: 

$$ y_i = \sum_{j=1}^{J} X_{ij} b_j + e_i, \tag{1} $$

or 

$$ y = Xb + e, \tag{2} $$



where $y = ((y_i))$, $X = ((X_{ij}))$, $b = ((b_j))$, and $e = ((e_i))$. Given an independent 
sample of subjects/observations for y and X, one is typically interested in 
estimating $b_j$ in order to minimize the following error sum of squares: 

$$ \sum_{i=1}^{I} e_i^2 = \sum_{i=1}^{I} \Big( y_i - \sum_{j=1}^{J} X_{ij} b_j \Big)^2. \tag{3} $$



Johnston (1984) and others have derived the well known analytical expres- 
sion for estimating b that minimizes (3): 



$$ b = (X'X)^{-1} X'y. \tag{4} $$



Maddala (1976) and others show that if the assumption is made that the random 
vector e is multivariate normally distributed, then the likelihood function 
can be written (assuming $E(ee') = \sigma^2 I$, where $I$ is an identity matrix) as: 

$$ L(y \mid b, \sigma^2) = (2\pi\sigma^2)^{-I/2} \exp\left[ -\frac{(y - Xb)'(y - Xb)}{2\sigma^2} \right], \tag{5} $$






TABLE 1 
Synthetic Regression Data 

  i    X_1i   X_2i   y_i 

  1     1     -3     -5 
  2     1     -2     -3 
  3     1     -1     -1 
  4     1      0      1 
  5     1      1      3 
  6     1      2      5 
  7     1      3      7 
  8     1     -3      5 
  9     1     -2      3 
 10     1     -1      1 
 11     1      0     -1 
 12     1      1     -3 
 13     1      2     -5 
 14     1      3     -7 

GROUP 1 (i = 1, ..., 7):    y_i = 2 X_2i + 1 

GROUP 2 (i = 8, ..., 14):   y_i = -2 X_2i - 1 



and the corresponding maximum likelihood estimates for b that maximize the 
likelihood function in (5) are equivalent to those obtained from least squares 
estimation (i.e., in expression (4)). 

There are many applications that arise in the social and physical sciences, 
however, where the estimation of a single set of regression coefficients 
may prove to be "misleading." Consider, for example, the small illustrative, 
synthetic data set provided in Table 1, with J = 2 and I = 14. If one were to 
estimate one regression function for all 14 subjects/observations, the resulting 
estimated linear function would be: 



$$ y_i = 0 X_{2i} + 0, \tag{6} $$



which naturally renders an $R^2 = 0$, a very poor summary of the structure of 
the data displayed in Table 1. As seen in Table 1, if one were initially to cluster 
the observations/subjects into two groups, where group one was comprised 
of observations/subjects 1-7 and group two contained observations/subjects 






8-14, and estimate two separate cluster regression functions, then the functions 
would be: 

Group 1: $y_i = 2X_{2i} + 1, \quad i = 1, \ldots, 7$ 

Group 2: $y_i = -2X_{2i} - 1, \quad i = 8, \ldots, 14,$ (7) 

with a combined $R^2 = 1.00$ indicating a perfect fit. Thus, the single estimated 
regression function in (6) "misrepresents" or "masks" the true structure 
present in the data. While one could legitimately argue for first plotting the 
data prior to estimation to check for such structure, such graphical displays 
cannot easily detect such "clusterings" as the dimensionality of the problem 
(J) increases. In addition, in many types of response surface estimation 
applications via experimental designs involving replications within subjects (e.g., 
conjoint analysis studies in marketing), the independent variable set often 
remains constant from subject to subject making such graphical detection 
extremely difficult. 
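
The Table 1 illustration can be checked numerically. The following is a minimal sketch, assuming NumPy and an intercept column of ones as the first independent variable; the helper names `ols` and `r_squared` are illustrative and are not taken from the paper.

```python
import numpy as np

# Synthetic data of Table 1: an intercept column plus X_2i, two groups of seven.
x2 = np.arange(-3.0, 4.0)                       # -3, -2, ..., 3
X = np.column_stack([np.ones(14), np.tile(x2, 2)])
y = np.concatenate([2 * x2 + 1, -2 * x2 - 1])   # group 1, then group 2


def ols(X, y):
    """Least-squares coefficients, equivalent to b = (X'X)^{-1} X'y."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return b


def r_squared(X, y, b):
    """Coefficient of determination for the fitted coefficients b."""
    resid = y - X @ b
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


b_pooled = ols(X, y)             # one function for all 14 observations
b_group1 = ols(X[:7], y[:7])     # observations 1-7
b_group2 = ols(X[7:], y[7:])     # observations 8-14
r2_pooled = r_squared(X, y, b_pooled)
```

Under these assumptions the pooled fit has zero intercept and slope, while the two group fits reproduce the generating lines exactly.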

This paper presents a new methodology for simultaneously estimating 
clusters and corresponding separate cluster regression functions given X and 
y from a sample of independent observations/subjects. We utilize finite 
conditional normal mixture distributions in a maximum likelihood context to 
estimate these parameters. We first review existing procedures that attempt to 
derive such simultaneous estimates. Next, the new methodology is presented 
together with the technical details of the E-M algorithm utilized for estimation. 
A Monte Carlo analysis is presented to examine the performance of this 
new methodology as a number of data and program options are experimentally 
manipulated. A marketing application is presented to examine the 
different evaluative criteria various senior managers utilize to evaluate the 
performance of their participation in trade shows. Finally, other applications 
as well as directions for future research are provided. 

2. Literature Review 

Much of the related psychometric and classification literature concerns 
attempts to rescale simultaneously the input variables and to solve for some 
clustering, all to optimize a common objective function. For example, 
DeSarbo, Carroll, Clark, and Green (1984) have devised the SYNCLUS 
methodology which simultaneously solves for a partitioning and a set of 
rescaling constants for the variables, all to optimize one common objective 
function. DeSarbo and Mahajan (1984) generalize this SYNCLUS methodology 
to accommodate constraints, different types of clustering schemes, and a 
general linear transformation of the variables. De Soete, DeSarbo, and Carroll 






(1985) have extended these concepts to an optimal variable weighting 
scheme for hierarchical clustering where both variable weights and 
ultrametric trees are simultaneously estimated. Note, however, that none of 
these approaches are appropriate for a clusterwise regression context with 
dependent and independent variables. 

The term "clusterwise regression" was originally coined by Späth 
(1979, 1981, 1982, 1985). Späth developed an exchange algorithm to form a 
partition of length K and corresponding sets of parameters $b_k$ such that the 
sum of the error sums of squares computed over all clusters is minimized: 

$$ \min Z = \sum_{k=1}^{K} \| X_k b_k - y_k \|^2. \tag{8} $$

Here, to guarantee the existence of a solution $b_k$, it is required that the rank 
of $X_k = J$. A necessary condition for this is $I_k \geq J$, which implies $I \geq KJ$, where 
$I_k$ is the number of observations/subjects in cluster k. Späth's methodology 
simultaneously solves for the optimal feasible partition $Q(K, I)$ and regression 
weights per cluster $b_{jk}$ that (locally) minimize expression (8). For the $L_2$ 
norm in expression (8), Späth (1982, 1985) has developed up- and down-dating 
formulae for the solution of these regression problems when an individual 
observation is added or removed, utilizing QR-decompositions. His 
stepwise-optimal method works sequentially on the observations and is 
conceptually similar to K-means (MacQueen 1967). The original procedure can 
be summarized as follows: 

1. Choose some initial partition $Q_1, \ldots, Q_K$ such that $|Q_k| \geq J$, and 
some starting observation i; 

2. Set i = i + 1 and reset i = 1 if i > I. For $i \in Q_j$ and 
$|Q_j| > J$, examine whether there are clusters $Q_k$ with $k \neq j$ 
such that shifting observation i from $Q_j$ to $Q_k$ reduces the objective 
function (expression (8)). If so, then choose $Q_k$ such that the reduction 
becomes maximal and redefine $Q_j = Q_j - \{i\}$, $Q_k = Q_k + \{i\}$. 
Otherwise return to step 2. 

3. Repeat step 2 as long as you get any reduction in the objective 
function; otherwise, stop. 
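
The exchange steps above can be sketched as follows. This is an illustrative simplification, not Späth's program: it assumes NumPy, refits each cluster regression from scratch instead of using QR up- and down-dating, and the function names and the `max_passes` safeguard are hypothetical.

```python
import numpy as np


def sse(X, y):
    """Error sum of squares of the least-squares fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ b
    return float(resid @ resid)


def exchange_clusterwise(X, y, labels, K, max_passes=50):
    """Move single observations between clusters while objective (8) decreases."""
    labels = labels.copy()
    I, J = X.shape

    def objective(lab):
        return sum(sse(X[lab == k], y[lab == k]) for k in range(K))

    best = objective(labels)
    for _ in range(max_passes):
        improved = False
        for i in range(I):
            j = labels[i]
            if np.sum(labels == j) <= J:      # donor cluster must keep >= J members
                continue
            best_k, best_val = j, best
            for k in range(K):
                if k == j:
                    continue
                trial = labels.copy()
                trial[i] = k
                val = objective(trial)
                if val < best_val - 1e-12:    # keep the largest reduction
                    best_k, best_val = k, val
            if best_k != j:
                labels[i], best, improved = best_k, best_val, True
        if not improved:                      # a full pass without reduction: stop
            break
    return labels, best


# Table 1 data with observation 1 deliberately placed in the wrong cluster.
x2 = np.arange(-3.0, 4.0)
X = np.column_stack([np.ones(14), np.tile(x2, 2)])
y = np.concatenate([2 * x2 + 1, -2 * x2 - 1])
start = np.array([0] * 7 + [1] * 7)
start[0] = 1
final_labels, final_z = exchange_clusterwise(X, y, start, K=2)
```

Because the result depends on the starting partition, a sketch like this would normally be rerun from several starts, as the text goes on to note.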

One selects a solution with K* clusters by choosing the solution with 
minimum value of Z in expression (8). According to Späth (1982), the final 
solution depends on the initial partition, on the starting observation, and on 
the choice of a minimum number of observations in each cluster. Because 
of problems with locally optimal solutions, Späth (1982) recommends running 
multiple analyses for a prespecified K, altering the initial starting partition 
and 






The primary goal of this research is to extend the concept of clusterwise 
regression to a stochastic context allowing for the possibility of fuzzy 
clusters, as well as mutually exclusive partitions. Given the documented 
problems with locally optimum solutions in Späth's (1985) deterministic 
procedure, we will devise a methodology that is hopefully less prone to such 
problems. In addition, we attempt to provide an AIC basis for selecting the 
most appropriate K*. 

3. Methodology 

3.1. The Model 

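In addition to the notation developed prior to equation (1), let: 

$k = 1, \ldots, K$ clusters; 

$b_{jk}$ = the value of the $j$-th regression coefficient for the $k$-th cluster; 

$\sigma_k^2$ = the variance term for the $k$-th cluster. 

We assume $y_i$ is distributed as a finite sum or mixture of conditional univariate 
normal densities: 

$$ y_i \sim \sum_{k=1}^{K} \lambda_k f_{ik}(y_i \mid X_{ij}, \sigma_k^2, b_{jk}) \tag{9} $$

$$ = \sum_{k=1}^{K} \lambda_k (2\pi\sigma_k^2)^{-1/2} \exp\left[ -\frac{(y_i - X_i b_k)^2}{2\sigma_k^2} \right], \tag{10} $$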



where $X_i = ((X_{ij}))_i$ and $b_k = ((b_j))_k$. That is, we assume an independent 
sample of subjects'/observations' dependent variable $y_1, y_2, \ldots, y_I$ drawn 
randomly from a mixture of conditional normal densities of underlying groups or 
clusters in unknown proportions $\lambda_1, \lambda_2, \ldots, \lambda_K$. Mixtures of univariate 
unconditional normal distributions have been the focus of many statisticians 
dating back to the seminal work by Pearson (1894) who derived estimators of 
the parameters of a mixture of two univariate normal distributions by equating 
sample moments to corresponding population or theoretical moments involving 
the solution of a ninth degree polynomial equation. Charlier and Wicksell 
(1924) and Cohen (1967) simplified these computations considerably using 
method of moments estimators. Hasselblad (1966) was one of the first 
statisticians to formulate a maximum likelihood estimation scheme for mixtures of 
two or more univariate normals. 

Note that our mixture model is conceptually similar to the uncondi- 
tional mixture approaches to pattern clustering originally proposed by Cooper 






(1964), Wolfe (1965, 1967, 1970), and Day (1969), where $X_i b_k$ in expression 
(12) replaces the population mean/centroid $\mu_k$ (see also Ganesalingam and 
McLachlan 1981; McLachlan 1982; Sclove 1977; Symons 1981; Scott and 
Symons 1971; Marriott 1975; Hartigan 1975, pp. 113-124; and Basford and 
McLachlan 1985). In fact, expressions (9) and (10) generalize the Quandt 
(1972), Hosmer (1974), and Quandt and Ramsey (1978) stochastic switching 
regression models to more than two "regimes" (see also Veaux 1986). In 
addition, the estimation algorithm employed here differs from typical method 
of moments and moment generating function estimation approaches. 

Given a sample of I independent subjects/observations, one can thus 
form a likelihood expression: 



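$$ L = \prod_{i=1}^{I} \left[ \sum_{k=1}^{K} \lambda_k (2\pi\sigma_k^2)^{-1/2} \exp\left[ -\frac{(y_i - X_i b_k)^2}{2\sigma_k^2} \right] \right], \tag{11} $$

or 

$$ \ln L = \sum_{i=1}^{I} \ln \left[ \sum_{k=1}^{K} \lambda_k (2\pi\sigma_k^2)^{-1/2} \exp\left[ -\frac{(y_i - X_i b_k)^2}{2\sigma_k^2} \right] \right]. \tag{12} $$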



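Given K, y, and X, one wishes to estimate $\lambda_k$, $\sigma_k^2$, and $b_{jk}$ in order to 
maximize L or ln L, where 

$$ 0 \leq \lambda_k \leq 1, \tag{13} $$

$$ \sum_{k=1}^{K} \lambda_k = 1, \tag{14} $$

$$ \sigma_k^2 > 0. \tag{15} $$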



It is interesting to note several properties of this formulation. First, unlike 
finite mixtures of other types of density functions, the parameters of finite 
mixtures of normal densities are identified (see Yakowitz 1970; Yakowitz and 
Spragins 1968; and Teicher 1961, 1963). Second, there exist no sufficient 
estimators for the parameters of a normal mixture (Dynkin 1961). Third, 
unless (15) is imposed, consistent estimators are not possible given that the 
likelihood function is unbounded when $\sigma_k^2 = 0$. Finally, note that once 
estimates of $\lambda_k$, $\sigma_k^2$, and $b_{jk}$ are obtained, one can assign each observation i to 
each cluster k (using Bayes' rule) via the estimated posterior probability: 






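$$ \hat{p}_{ik} = \frac{\hat{\lambda}_k f_{ik}(y_i \mid X_{ij}, \hat{\sigma}_k^2, \hat{b}_{jk})}{\sum_{k=1}^{K} \hat{\lambda}_k f_{ik}(y_i \mid X_{ij}, \hat{\sigma}_k^2, \hat{b}_{jk})}. \tag{16} $$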



This result renders a "fuzzy" clustering of the I subjects/observations. One 
could form partitions by applying the rule: 

Assign i to k iff $\hat{p}_{ik} > \hat{p}_{il}$ for all $l \neq k = 1, \ldots, K$. 
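
The posterior probabilities and the partition rule can be illustrated with a short sketch. It assumes NumPy, stores the coefficients as a J x K array `B`, and works in logs for numerical stability; these are implementation choices made here, not details given in the paper. The parameter values are the generating values of Table 1 with an arbitrary small variance.

```python
import numpy as np


def posterior_probabilities(y, X, lam, B, sigma2):
    """Posterior membership probabilities p_ik via Bayes' rule.

    y: (I,), X: (I, J), lam: (K,), B: (J, K), sigma2: (K,).
    """
    resid = y[:, None] - X @ B                          # (I, K) residuals
    log_num = (np.log(lam) - 0.5 * np.log(2 * np.pi * sigma2)
               - resid ** 2 / (2 * sigma2))             # log(lambda_k * f_ik)
    log_num -= log_num.max(axis=1, keepdims=True)       # stabilise the exponentials
    num = np.exp(log_num)
    return num / num.sum(axis=1, keepdims=True)


# Table 1 data evaluated at the generating parameters (illustrative values).
x2 = np.arange(-3.0, 4.0)
X = np.column_stack([np.ones(14), np.tile(x2, 2)])
y = np.concatenate([2 * x2 + 1, -2 * x2 - 1])
lam = np.array([0.5, 0.5])
B = np.array([[1.0, -1.0],      # intercepts for clusters 1 and 2
              [2.0, -2.0]])     # slopes for clusters 1 and 2
sigma2 = np.array([0.01, 0.01])

P = posterior_probabilities(y, X, lam, B, sigma2)
hard_assignment = P.argmax(axis=1)    # partition rule: largest posterior wins
```

Each row of `P` sums to one, and taking the row-wise maximum turns the fuzzy clustering into a partition.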

3.2. The Algorithm 

The maximum likelihood estimates of $\lambda_k$, $b_k$, $\sigma_k^2$ and $p_{ik}$ are found by 
initially forming an augmented log likelihood function to reflect the 
constraints in expression (14): 



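$$ \Phi = \sum_{i=1}^{I} \ln \left[ \sum_{k=1}^{K} \lambda_k f_{ik}(y_i \mid X_{ij}, \sigma_k^2, b_{jk}) \right] - \mu \left( \sum_{k=1}^{K} \lambda_k - 1 \right). \tag{17} $$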



The resulting maximum likelihood stationary equations are obtained by 
equating the first order partial derivatives of the augmented log likelihood 
function in (17) to zero: 



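$$ \frac{\partial \Phi}{\partial \lambda_k} = \sum_{i=1}^{I} \frac{f_{ik}(*)}{\sum_{k} \lambda_k f_{ik}(*)} - \mu = 0, \tag{18} $$

$$ \frac{\partial \Phi}{\partial \sigma_k^2} = \sum_{i=1}^{I} \frac{\lambda_k}{\sum_{k} \lambda_k f_{ik}(*)} \frac{\partial f_{ik}(*)}{\partial \sigma_k^2} = 0, \tag{19} $$

$$ \frac{\partial \Phi}{\partial b_k} = \sum_{i=1}^{I} \frac{\lambda_k}{\sum_{k} \lambda_k f_{ik}(*)} \frac{\partial f_{ik}(*)}{\partial b_k} = 0, \tag{20} $$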



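where $f_{ik}(*)$ is used for $f_{ik}(y_i \mid X_{ij}, \sigma_k^2, b_{jk})$. To estimate $\mu$, we multiply both 
sides of equation (18) by $\lambda_k$ and then sum both sides over k: 

$$ \sum_{i=1}^{I} \frac{\sum_{k} \lambda_k f_{ik}(*)}{\sum_{k} \lambda_k f_{ik}(*)} - \mu \sum_{k} \lambda_k = 0, \tag{21} $$

or 

$$ \hat{\mu} = I. \tag{22} $$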

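To estimate $\lambda_k$, we multiply both sides of equation (18) by $\lambda_k$ and simplify: 

$$ \sum_{i=1}^{I} \hat{p}_{ik} - \lambda_k I = 0, \tag{23} $$

or 

$$ \hat{\lambda}_k = \frac{\sum_{i=1}^{I} \hat{p}_{ik}}{I}. \tag{24} $$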



In order to estimate $\sigma_k^2$ and $b_{jk}$, we use the definition of $\hat{p}_{ik}$ in (16) and 
express (19) and (20) as: 

$$ \sum_{i=1}^{I} \hat{p}_{ik} \frac{\partial \ln f_{ik}(*)}{\partial \sigma_k^2} = 0, \tag{25} $$

$$ \sum_{i=1}^{I} \hat{p}_{ik} \frac{\partial \ln f_{ik}(*)}{\partial b_k} = 0. \tag{26} $$

Thus, the maximum likelihood equations for estimating the parameters $\sigma_k^2$ 
and $b_{jk}$ are weighted averages of the maximum likelihood equations 
$\partial \ln f_{ik}(*) / \partial \theta = 0$, where $\theta$ reflects the parameter of interest, arising from 
each component separately and the weights are the posterior probabilities of 
membership of the subjects/observations in each cluster. This particular 
structure gainfully lends itself to the development of a two stage E-M 
algorithm (Dempster, Laird, and Rubin 1977) for the estimation of these 
parameters (see Hosmer 1974; Veaux 1986). In the E-stage, one estimates $\lambda_k$ and $p_{ik}$ 
via expressions (16) and (24). In the M-stage, one estimates $b_{jk}$ and $\sigma_k^2$ via K 
weighted least squares regressions. In order to show this M-stage, we expand 
(25) and (26): 






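$$ \sum_{i=1}^{I} \frac{\lambda_k}{\sum_{k} \lambda_k f_{ik}(*)} (2\pi\sigma_k^2)^{-1/2} \exp\left[ -\frac{(y_i - X_i b_k)^2}{2\sigma_k^2} \right] \frac{2 (y_i - X_i b_k) X_i}{2\sigma_k^2} = 0, \tag{27} $$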



which are identical to the stationary equations derived by solving a 
weighted least squares problem where y and X are each weighted by $\hat{p}_{ik}^{1/2}$. 
Thus, the entire set of $b_k$ is derived by performing K separate weighted 
least-squares analyses. Once this is done, the estimates of $\sigma_k^2$ follow: 



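$$ \sum_{i=1}^{I} \frac{1}{\sum_{k} \lambda_k f_{ik}(*)} \left\{ \lambda_k \exp\left[ -\frac{(y_i - X_i b_k)^2}{2\sigma_k^2} \right] \left( -\tfrac{1}{2} (2\pi\sigma_k^2)^{-3/2} 2\pi \right) + \lambda_k (2\pi\sigma_k^2)^{-1/2} \exp\left[ -\frac{(y_i - X_i b_k)^2}{2\sigma_k^2} \right] \frac{\tfrac{1}{2} (y_i - X_i b_k)^2}{\sigma_k^4} \right\} = 0, $$

or 

$$ \sum_{i=1}^{I} \hat{p}_{ik} \left[ \frac{-1}{2\sigma_k^2} + \frac{(y_i - X_i b_k)^2}{2\sigma_k^4} \right] = 0. \tag{28} $$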



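Multiplying both sides of (28) by $2\sigma_k^4$ and simplifying, one obtains: 

$$ \hat{\sigma}_k^2 = \frac{\sum_{i=1}^{I} \hat{p}_{ik} (y_i - X_i \hat{b}_k)^2}{\sum_{i=1}^{I} \hat{p}_{ik}}. \tag{29} $$

Thus, $\hat{\sigma}_k^2$ can be obtained during the K weighted least-squares procedures for 
estimating $b_k$. Note, because (17) becomes unbounded as $\sigma_k^2 \to 0$, $\hat{\sigma}_k^2$ is set 
to a default small positive value (.01) if it becomes small during these 
iterations. 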

Thus, the computation of the maximum likelihood estimates is facilitated 
by the use of this E-M algorithm. For given starting values of the 
parameters, the expectation (E phase) and maximization (M phase) steps of 
this algorithm are alternated until convergence of a sequence of log 
likelihood values is obtained. Dempster, Laird, and Rubin (1977) prove that: 



$$ \Phi^{(m+1)} \geq \Phi^{(m)}, \tag{30} $$



where m is the iteration counter, indicating that the E-M algorithm provides 
monotone increasing values of the objective function. Given the constraint 
$\sigma_k^2 \geq .01$, one can show that $\Phi$ is bounded from above and convergence to at 
least a local maximum can be established (cf. Titterington, Smith, and Makov 
1985). While several authors (e.g., Everitt and Hand 1981 and Redner and 
Walker 1984) have documented the potentially slow convergence rate of E-M 
procedures for estimating the parameters of unconditional mixture 
distributions, we find that our E-M procedure typically converges in 100 or fewer 
iterations. Moreover, the iterations are processed much faster than if a gradient 
based procedure had been used. Acceleration procedures discussed by Peters 
and Walker (1978), Wilson and Sargent (1979), and Louis (1982) are 
currently being investigated. We provide a Monte Carlo analysis in the next 
major section to investigate the performance of this E-M algorithm in a 
reasonably rigorous manner. 
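
The E-stage and M-stage described in this section can be put together in a compact sketch. It is an illustration under stated assumptions rather than the authors' program: NumPy is assumed, the function name `em_clusterwise`, the equal starting proportions, the common starting variance, and the stopping tolerance are all choices made here, and only the .01 variance floor and the limit of 100 iterations come from the text.

```python
import numpy as np


def em_clusterwise(y, X, B0, n_iter=100, tol=1e-8, floor=0.01):
    """Alternate E- and M-stages for a K-cluster conditional normal mixture."""
    I, J = X.shape
    K = B0.shape[1]
    B = B0.astype(float).copy()
    lam = np.full(K, 1.0 / K)              # assumed start: equal proportions
    sigma2 = np.full(K, np.var(y))         # assumed start: common variance
    history = []                           # log likelihood at each iteration
    for _ in range(n_iter):
        # E-stage: posterior probabilities and mixing proportions.
        resid = y[:, None] - X @ B
        logc = (np.log(lam) - 0.5 * np.log(2 * np.pi * sigma2)
                - resid ** 2 / (2 * sigma2))
        m = logc.max(axis=1, keepdims=True)
        log_mix = m[:, 0] + np.log(np.exp(logc - m).sum(axis=1))
        history.append(float(log_mix.sum()))
        P = np.exp(logc - log_mix[:, None])
        lam = np.maximum(P.mean(axis=0), 1e-12)   # guard against empty clusters
        lam /= lam.sum()
        # M-stage: K weighted least-squares fits, then the variance estimates.
        for k in range(K):
            w = np.sqrt(P[:, k])                  # rows weighted by sqrt(p_ik)
            B[:, k], *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
            r = y - X @ B[:, k]
            sigma2[k] = max((P[:, k] @ r ** 2) / max(P[:, k].sum(), 1e-12), floor)
        if len(history) > 1 and abs(history[-1] - history[-2]) < tol:
            break
    return lam, B, sigma2, P, history


# Table 1 data with rough starting coefficients (columns are clusters).
x2 = np.arange(-3.0, 4.0)
X = np.column_stack([np.ones(14), np.tile(x2, 2)])
y = np.concatenate([2 * x2 + 1, -2 * x2 - 1])
B0 = np.array([[0.8, -0.8], [1.5, -1.5]])
lam, B, sigma2, P, history = em_clusterwise(y, X, B0)
```

On this small data set the sketch should move from the rough start to the two generating lines, with the recorded log likelihood values never decreasing from one iteration to the next.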

Our approach to identify the appropriate number of clusters K* in such 
mixture clustering procedures (cf. Sclove 1977) involves the use of the 
Akaike Information Criterion (Akaike 1974) which is defined as: 

$$ AIC(K) = -2 \ln L + 2 n(K), \tag{31} $$

where $n(K)$ is the effective number of parameters estimated in a K clusterwise 
regression solution: 

$$ n(K) = (K - 1) + KJ + K. \tag{32} $$


This AIC criterion has been previously used in an unconditional 
mixture/clustering context by Sclove (1983). However, as pointed out by 
Bozdogan (1983) and Sclove (1987), one major problem with the use of such 
a criterion is that the theoretical justification for use of AIC relies on the same 
conditions as the usual asymptotic theory of the GLR test. In this context, 
some analytical conditions required for series expansions yielding the AIC 
are not strictly met (see McLachlan and Basford 1988, p. 28; Sclove 1987), 
and the criterion can thus be regarded as a "heuristic figure of merit" where 
one selects K* which renders minimum AIC(K). 
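
A short sketch of AIC-based selection of K* follows. The parameter count is an assumption made for this illustration: K - 1 free mixing proportions, KJ regression coefficients, and K variances when these are estimated. The function names and the log likelihood values in the usage lines are hypothetical and are not results from the paper.

```python
def n_parameters(K, J, estimate_variances=True):
    """Assumed effective parameter count for a K-cluster solution."""
    n = (K - 1) + K * J            # free mixing proportions + coefficients
    if estimate_variances:
        n += K                     # one variance per cluster
    return n


def aic(log_likelihood, K, J, estimate_variances=True):
    """AIC(K) = -2 ln L + 2 n(K); smaller is better."""
    return -2.0 * log_likelihood + 2.0 * n_parameters(K, J, estimate_variances)


def select_k(log_likelihoods, J):
    """Return the K with minimum AIC and the AIC of every candidate K."""
    scores = {K: aic(ll, K, J) for K, ll in log_likelihoods.items()}
    return min(scores, key=scores.get), scores


# Hypothetical maximised log likelihoods for K = 1, 2, 3 with J = 2.
best_k, scores = select_k({1: -40.0, 2: -10.0, 3: -9.5}, J=2)
```

The small gain in log likelihood from K = 2 to K = 3 in this made-up example does not offset the penalty for the extra parameters, so K = 2 is retained.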

Note that the likelihood ratio criterion for testing the hypothesis of $K_1$ 
versus $K_2$ clusters, where $K_1 < K_2$, does not have its usual asymptotic 
distribution as mentioned by Hartigan (1977), Binder (1978), and McLachlan and 
Basford (1988, p. 27). Basford and McLachlan (1985) have adapted Wolfe's 
(1971) approach in introducing a constant to improve a $\chi^2$ approximation for 
the likelihood ratio test. However, the reliability of such an approximation 






will naturally depend on the size of I. McLachlan (1987) has recently 
examined the boot-strapping of the log likelihood ratio statistic to assess the null 
distribution of -2 log L. Further research is required in this area. 

One of the appealing properties of maximum likelihood estimators is 
that, under typical regularity conditions, these estimators are asymptotically 
normal. Define b as a vector of all the $(b_1, b_2, \ldots, b_K)$ estimated coefficients 
in a maximum likelihood context, and B as the corresponding vector of 
unknown population parameters $(B_1, B_2, \ldots, B_K)$. Then, according to Theil 
(1971), 



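$$ \sqrt{I}\,(b - B) \xrightarrow{d} N\left( 0, \lim \left( R(B) / I \right)^{-1} \right), \tag{33} $$

where: 

$$ R(B) = -E\left[ \frac{\partial^2 \ln L}{\partial B\, \partial B'} \right], \tag{34} $$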



the information matrix. According to Judge, Griffiths, Hill, Lütkepohl, and 
Lee (1985), replacing $\lim(R(B)/I)$ by a consistent estimator does not change 
the asymptotic distribution of the test statistics or confidence intervals for b. 
Here, the consistent estimator utilized is: 



$$ F = -\frac{1}{I} \sum_{i=1}^{I} \frac{\partial^2 \ln L_i}{\partial b\, \partial b'}, \tag{35} $$



and the asymptotic variances of b can be defined as the main diagonal 
elements of $F^{-1}$, the asymptotic variance-covariance matrix. From (33)-(35), it 
follows that an asymptotic $(1 - \alpha)100\%$ confidence interval for $b_{jk}$ is given 
by 

$$ \hat{b}_{jk} \pm z_{\alpha/2} \sqrt{\widehat{\mathrm{var}}(b_{jk})}, \tag{36} $$

where $z_{\alpha/2}$ is the central value of a normal distribution with mean zero and 
variance one and $\widehat{\mathrm{var}}(b_{jk})$ is the asymptotic estimate of the variance of $b_{jk}$. 

3.3. Synthetic Data Analysis 



The synthetic data in Table 1 were analyzed by our conditional mixture 
E-M based procedure. Table 2 presents a statistical and computational 
summary for 1 to 4 clusters. As clearly delineated in this table, the K = 2 
cluster solution is the "best" one given that the minimum AIC is obtained 
here. The recovered parameters are also shown in the table for this small 






TABLE 2 

Conditional Mixture Maximum Likelihood 
Procedure Results for Synthetic Data 

       Number of Iterations 
K      Required for Convergence      ln L      AIC 



1 2 

3 7 

4 a 
^ihln\ua AID 



-12, 47.73; 



Recovered Parameters : 

1 0 

X 0 

X « (;s .5) 1 0 

1= 0: 

1 d 

1 b 

0 m. (,5 .5) p i 6: 

" 6 i 

0 I 

6 1: 

0 1 

0 i 



illustrative data set. We performed 20 additional computer runs for the two 
cluster solution, varying the starting estimates of $b_{jk}$ using the uniform 
distribution U(-2, 2). The starting values produced log likelihoods in the range of 
-39.13 to ^1.32. In all 20 cases, the procedure converged within 5-7 iterations 
to this same globally optimum solution presented in Table 2. Given that 
the starting values were generated from the same distribution as the actual 
values, we performed an additional 20 computer runs utilizing U(-20, 20) 
for the starting $b_{jk}$. This generated initial log likelihood values ranging from 
-69.95 to -432.83. Here, we recovered the actual parameter values shown in 
Table 2 in 18 of the 20 cases, showing some deterioration in performance as 
the "quality" of the starting values deteriorated. Note that Späth's (1985) 
procedure was able to recover the actual $b_{jk}$ values in 14 of 20 computer runs 
in the version of the computer program we initially purchased from him. 






TABLE 3 

Independent Factors and Levels Utilized in the Monte Carlo Analysis 

FACTOR                                      LEVELS                            CODE 

A. Number of Clusters (K)                   K = 2                             2 
                                            K = 3                             3 
                                            K = 4                             4 

B. Number of Independent                    J = 2                             2 
   Variables (J) in X                       J = 5                             5 
                                            J = 8                             8 

C. Number of Observations (I)               I = 50                            50 
                                            I = 100                           100 
                                            I = 150                           150 

D. Difference in Scale of $\sigma_k^2$, 
   e.g., for K = 3                          $\sigma$ = 1, 2, 3                1 
                                            $\sigma$ = 2, 4, 6                2 

E. Comparative Range of Mixing              Equal $\lambda$'s                 1 
   Proportions ($\lambda_k$)                Unequal $\lambda$'s               2 

F. Distribution of each $b_{jk}$            $b_{jk} \sim N(0, 1)$             1 
                                            $b_{jk} \sim N(k, 1)$             2 
                                            $b_{jk} \sim N(2k, 1)$            3 

G. Estimation Option for $\sigma_k^2$       $\sigma_k^2$ fixed at true        1 
                                            values 
                                            estimate $\sigma_k^2$             2 



4. Monte Carlo Analysis 

In order to examine systematically the performance of the conditional 
mixture E-M algorithm, a Monte Carlo analysis was performed where some 
seven factors were experimentally varied: K, J, I, scale of $\sigma_k^2$, $\lambda_k$, the 
distribution of $b_{jk}$, and $\sigma_k^2$ estimation options. Table 3 describes these seven 
independent factors and the various levels tested for each factor. These seven factors 






TABLE 4 

$3^5 2^2$ Asymmetric Fractional Factorial Design 

TRIAL    A    B     C    D    E    F    G 

  1      2    2    50    1    1    1    1 
  2      2    2    50    1    2    2    1 
  3      2    2    50    1    1    3    2 
  4      2    5   100    3    1    1    1 
  5      2    5   100    3    1    2    2 
  6      2    5   100    3    2    3    1 
  7      2    8   150    2    2    1    2 
  8      2    8   150    2    1    2    1 
  9      2    8   150    2    1    3    1 
 10      3    2   100    2    2    2    2 
 11      3    2   100    2    1    3    1 
 12      3    2   100    2    1    1    1 
 13      3    5   150    1    1    2    1 
 14      3    5   150    1    2    3    1 
 15      3    5   150    1    1    1    2 
 16      3    8    50    3    1    2    1 
 17      3    8    50    3    1    3    2 
 18      3    8    50    3    2    1    1 
 19      4    2   150    3    1    3    1 
 20      4    2   150    3    1    1    2 
 21      4    2   150    3    2    2    1 
 22      4    5    50    2    2    3    2 
 23      4    5    50    2    1    1    1 
 24      4    5    50    2    1    2    1 
 25      4    8   100    1    1    3    1 
 26      4    8   100    1    2    1    1 
 27      4    8   100    1    1    2    2 



were combined via an asymmetric fractional factorial design (cf. Addelman 
1962) for main effects only estimation. Twenty-seven experimental trials 
were devised where the seven factors were varied according to the $3^5 2^2$ 
fractional factorial design portrayed in Table 4. Such procedures have been 
previously utilized by DeSarbo (1982) and DeSarbo and Carroll (1985) in the 
psychometric literature for preliminary testing of new algorithms. Note that 
each trial (or row) of the experimental design defined a specific level for each 
of the seven factors listed in Table 3. Based on the stipulated levels of J and 
I, X was randomly generated from a uniform distribution. Given designated 
levels of $\lambda_k$, $\sigma_k^2$, and $b_k$, y was generated via the mixture specification 
described in equation (9), and the conditional maximum likelihood 
clusterwise linear regression procedure was executed. 
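
One way to generate data for a single trial of such a design is sketched below. The range of the uniform distribution for X, the absence of an intercept column, and the function name `simulate_clusterwise` are assumptions made for illustration; the paper states only that X was uniform and that y followed the mixture specification.

```python
import numpy as np


def simulate_clusterwise(I, lam, B, sigma, rng):
    """Draw X, y and the latent cluster labels from the conditional mixture.

    lam: (K,) mixing proportions, B: (J, K) coefficients, sigma: (K,) std devs.
    """
    J, K = B.shape
    X = rng.uniform(0.0, 1.0, size=(I, J))     # assumed range [0, 1)
    z = rng.choice(K, size=I, p=lam)           # latent cluster of each observation
    mean = (X * B[:, z].T).sum(axis=1)         # X_i b_k for the drawn cluster k
    y = mean + sigma[z] * rng.standard_normal(I)
    return X, y, z


# Example resembling one design cell: K = 3, J = 5, I = 100, equal proportions,
# standard deviations 1, 2, 3, and coefficients b_jk drawn from N(k, 1).
rng = np.random.default_rng(0)
K, J, I = 3, 5, 100
lam = np.full(K, 1.0 / K)
B_true = rng.normal(loc=np.arange(1, K + 1), scale=1.0, size=(J, K))
sigma = np.array([1.0, 2.0, 3.0])
X_sim, y_sim, z_sim = simulate_clusterwise(I, lam, B_true, sigma, rng)
```

Keeping the latent labels `z_sim` alongside the data makes it possible to compare recovered and actual memberships afterwards.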

The dependent/performance measures collected were: 

1. Number of iterations required for convergence. This was to measure 
the computational effort involved in processing given CPU time 
was unavailable. Note, a maximum of 100 iterations was specified. 






2. $AIC_a - AIC_r$. This is the difference in the Akaike information 
statistic between that obtained for the actual values generated 
synthetically for the Monte Carlo analysis ($AIC_a$) and that obtained via 
the solution recovered by the methodology ($AIC_r$). Note that this 
difference was taken to eliminate the dependence of such a measure 
on I, J and K. Positive values for this difference would indicate the 
methodology recovered a better solution than compared to the actual 
parameters. 

3. $b_k$ parameter recovery. A root mean square between the actual $b_k$ 
and estimated $b_k$ (after appropriate permutation) is calculated to 
measure how well the procedure can recover the clusters' regression 
coefficients. 

4. $\sigma_k$ parameter recovery. A root mean square between the actual 
and estimated $\sigma_k$ (after appropriate permutation) is calculated to 
measure how well the procedure can recover these parameters. 

5. $p_{ik}$ recovery. A root mean square between the actual posterior 
probabilities and estimated $p_{ik}$ is calculated (after appropriate 
permutation) to measure how well the procedure can reproduce these 
cluster membership probabilities. 

6. $\lambda_k$ recovery. Finally, a root mean square between the actual mixing 
proportions and estimated $\lambda_k$ is calculated (after appropriate 
permutation) to measure how well the procedure can recover these mixing 
proportions. 

Thus, these six measures encompass the three major areas of algorithm 
performance: computational demands, data reproduction, and parameter 
recovery. 
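
The recovery measures require matching estimated clusters to actual ones, since cluster labels are arbitrary. The sketch below assumes that the "appropriate permutation" is the one minimizing the root mean square; the paper does not say how the permutation was chosen, and the function name is hypothetical. It is shown for the coefficient matrix, but the same idea applies to the other parameters.

```python
import itertools

import numpy as np


def rms_after_permutation(B_true, B_est):
    """Smallest root mean square between actual and estimated coefficients
    over all relabellings of the K estimated clusters (columns)."""
    K = B_true.shape[1]
    best_rms, best_perm = np.inf, None
    for perm in itertools.permutations(range(K)):
        diff = B_true - B_est[:, list(perm)]
        rms = float(np.sqrt(np.mean(diff ** 2)))
        if rms < best_rms:
            best_rms, best_perm = rms, perm
    return best_rms, best_perm


# Estimated clusters arrive in swapped order and are off by 0.1 everywhere.
B_actual = np.array([[1.0, -1.0], [2.0, -2.0]])
B_estimated = B_actual[:, ::-1] + 0.1
rms, perm = rms_after_permutation(B_actual, B_estimated)
```

Enumerating all K! permutations is only practical for small K, which suffices for designs with two to four clusters.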

Table 5 presents the results for these six dependent measures for each 
of the twenty-seven trials designated by the asymmetric fractional factorial 
design presented in Table 4. As can be seen in Table 5, only two of the 
twenty-seven trials (#18, #19) failed to reach convergence within the 
maximum limit of 100 iterations. Also, note that all twenty-seven trials resulted 
in positive values for the second dependent measure, indicating the procedure 
always provided estimates whose resultant AIC statistic was better (lower) 
than that produced by the actual parameters. The difference in the magnitudes 
between the four RMS dependent measures 3, 4, 5, and 6 reflects the 
differences in the scale of the numbers utilized as parameter values rather 
than better/poorer fits. Table 5 also presents the correlations between these 
six measures. Of particular note is the rather large positive correlation 
(0.764) between dependent measures #3 and #4, indicating an association in 
attempting to recover the parameters ($b_k$, $\sigma_k^2$) of the conditional normal 
distributions. Note, in none of the trials were there estimates of $\sigma_k^2$ near 0.00. 






TABLE 5 

Dependent Measure Results from the Monte Carlo Analysis 

                        DEPENDENT MEASURE 

TRIAL      1        2        3        4       5       6 

1 ■ 
1 






1 , 242 


1 . 000 


.559 


.051 


A 


in 


• V ^ 7 


A • y w X 


1 .000 


. 744 


. 229 


5 




7 AA7 


« 468 


. 404 


551 

« 6** 


. 137 






0» I CM 


■ O J,U 


4 l/VV. 




, 049 


e 
9 


CO 




OAA 




* i 


.017 


P 


m 


0 . T / i. 


■) • voy 






.382- 


/ ■ 


LY 
•n 


17, 137 


.698 


. 432 


. 248 


,134 


  8       32    15.715    2.489    2.000    .647    .051 
  9       10    26.284    3.476    2.000    .761    .051 
 10       37     5.465    1.918    4.031    .328    .235 
 11       51    10.761    3.161    2.828    .379    .132 
 12       37     4.686    1.652    3.266    .346    .078 
 13       34     8.910     .900     .816    .266    .023 
 14       40     9.826    4.231    1.414    .606    .171 
 15       84     2.169    1.662     .604    .425    .259 
 16       55    26.423    4.560    4.899    .452    .173 
 17       20    34.840    6.917    4.714    .480    .156 
 18      100    30.223    4.892    4.243    .486    .265 
 19      100    10.832    8.497    5.612    .372    .206 
 20       57     6.254    2.424    2.655    .247    .171 
 21       6^    11.633    6.551    3.674    .374    .232 
 22       15    40.187    5.806    4.345    .337    .210 
 23       29    36.779    7.729    4.000    .399    .170 
 24       56    23.237    3.801    2.828    .313    .147 
 25       62    14.230    3.978     .707    .412    .094 
 26       94    54.187    4.200    1.225    .500    .250 
 27       44    95.596    3.173    1.576    .473    .059 


CORRELATIONS: 

           1        2        3        4        5        6 

 1      1.000     .075     .273     .170    -.165     .273 
 2               1.000     .354     .163     .161    -.025 
 3                        1.000     .764     .241     .395 
 4                                 1.000     .121     .480 
 5                                          1.000     .245 
 6                                                   1.000 



Table 6 presents the results of the six regression analyses performed, 
one for each of the dependent measures. Here, as in conjoint analysis (cf. 
Green and Rao 1971), the design matrix is converted into dummy variables 
prior to the regression analysis. Such a methodology has been similarly used 
in De Soete, DeSarbo, Furnas, and Carroll (1984), DeSarbo and Carroll 
(1985), and DeSarbo (1982) in the Monte Carlo testing of new 
methodologies. 

Estimating solutions with larger numbers of clusters significantly 
increases the number of iterations required for convergence. While not sig- 






TABLE 6 

Regression Analyses of Monte Carlo Results 

DEPENDENT MEASURE: 





1 


•2- 


3. 


4 






IHTBRCEPT 


34.06 


1.33 


0.96; 


0.38 


0.48 


O.iO 




23.11* 


4.07 


1 . 59* 


1.85** 


-0l03 


o.bs 


Kt4 


3oaj*^ 


21.81** 


3.40** 


1.83** 


-0.07 


0.05 


J«5 


-is. 00 


8.49 


0.08 


-0 .80 


^0.04 


-6.01 




5,56 


27.6i*i^ 


b.ia 


-6*30 


0^ 10 


-6.02 




10.33 


-0;21 


-1.65* 


-lil7* 


-0.08 


-0,04 


I- 1 50 


13.89 


-11.03 


-0,76: 


-r6.91* 


-o.oi 


-0.04 


e of 2 


-12.44. 


-2.33 


0.95^ 


1.89** 


-0.05 


-0.01 


0 of 3 


9.00 


-6.65 


1. 83* 


2.26** 


-O.ll 


0.05 


tttJEQUAL, Alt 


1.89 


1.06 


0.53 


0.36 


0 . 08 


6.12** 




-U.33 


4.201 


0.15 


d.ii 


0.05 


-6.02: 




-20.11 


-0.02 


1.59* 


O - 84 


oilo 


o.pi 


0^ fStiMATEO 


-7.61 


7.09 


-i>P9 


-6.36 


-0,14 


-6.pt 


S.E. 


22.74 


1S.21 


1.32 


0.85 


0.15 


0.08 


r2 


0.59 


0.69 


0.81 


0.86 


0.55 


0.59 


adj R? 


6.23 


0.43 


0.65 . 


6.74 


oa6 


6.24 


F 


1,66 


2.65* 


4.96** 


7 . 10** 


1.41 


h68 



* p<-p5' 
**Pi.01 



niticantt data with laiger / said lai:ger / al^o tend tp incieasQ tlKf xiumbef of 
iterations. Thus, lai^er data sets and spiutioris estimati^ laiger humbeis of 
parametefs tend to increase computational dmandte, aldioug^ die regiessiorx 
equation as a whole is not significant 

A somewhat surprising finding is seen with respect to the second
dependent measure concerning the difference in AIC statistics produced by
the actual parameters vs. the recovered ones. Here, as the number of estimated
parameters increases (K = 4, J = 8), the procedure is somewhat more likely to
recover parameters with lower associated AIC statistics than those
produced by the actual parameters. Thus, as the dimensionality of the
parameter space increases, all else held equal, there is a greater chance of
finding a better solution. Note that this regression equation is significant
at p ≤ .05.

Some unanticipated results are found concerning the regression
analysis conducted with dependent measure #3 involving b_jk recovery.
While the results that solutions involving larger numbers of clusters (and
more parameters) tend to detract from recovery and that larger sample sizes
enhance b_jk recovery make intuitive sense, the positive and significant
impacts of the larger σ_k scale and b_jk ~ N(2K, 4) levels are a bit harder to
interpret. Presumably, as σ_k gets larger, the variance of b_k increases,
rendering larger errors in recovery. Note that this regression equation is
quite significant.

Some similar patterns are also seen with respect to the regression analysis
performed on dependent measure #4, σ_k recovery. Here, σ_k recovery is
reduced as the number of terms increases and as the size and difference of
the σ_k increase. All else held equal, as I increases, it becomes
significantly easier to recover the σ_k values. This makes considerable sense
in light of traditional statistical estimation theory concerning the
impact of higher degrees of freedom in estimation. Again, this regression
equation is significant.

The final two regression equations for dependent measures #5 (p_ik
recovery) and #6 (λ_k recovery) are not significant. For the p_ik RMS equation,
no factor level is significant. For the λ_k RMS equation, it appears that
estimating unequal λ_k tends to detract from λ_k recovery, although this
regression equation again is not significant.

Thus, the Monte Carlo analysis appears to result in several rather
interesting findings. As the number of parameters to be estimated increases,
all else held equal, computational time will increase and parameter recovery,
in general, will suffer, as is the case in most nonlinear estimation problems.
Similarly, increasing the sample size, holding all else equal, may also
increase computational demands, but will typically improve parameter
recovery. Finally, increasing the variance of the parameters to be estimated
will also tend to result in poorer parameter recovery. However, given the
preliminary nature of the analysis, these results must be subjected to further
testing.

Some obvious limitations of this Monte Carlo analysis must be noted.
The use of the fractional factorial design does not allow the flexibility of
measuring possible interaction effects between the factors studied in the
analysis. Clearly, assuming computational time/expense was not a limitation,
a full factorial design would have been a more comprehensive design to use
in order to estimate possible significant higher order interaction terms. In
addition, the design should have been replicated in order to improve the
degrees of freedom for estimation. Finally, more levels for each of the factors
should be investigated, and other factors (e.g., cluster size and shape)
introduced in the design. We leave these projects for future research.

5. Application: Trade Show Performance

5.1. Study Description 

Trade shows are promotional events used by marketers to draw a large
number of prospective buyers to view exhibits of products/services in a few
concentrated days. Such trade shows have become a very popular medium
for promoting products and services, especially in the industrial sector.
Cleaver (1982) published figures indicating that over 91,000 firms display
such exhibits at some 8,000 trade shows to over 31 million prospective buyers
at a total cost of $7 billion annually. Many firms will allocate up to 25% of
their total promotion budget for trade shows (Mee 1983a). Historically, trade
show participation has been viewed as an extension of a firm's personal
selling effort. However, Bonoma (1983) revealed that trade shows have a much
broader role than merely generating sales. Many firms consider such
non-selling factors as image enhancement, gathering competitive information,
and improving corporate morale as equal to, if not more important than,
identifying leads or making sales.

Recently, Kerin and Cron (1987) conducted a survey of trade show
exhibit managers and senior marketing executives in 129 firms that were
heavy participants in trade shows. One of their objectives was to investigate
the selling vs. non-selling role of trade shows. A self-administered
questionnaire was separately mailed to the trade show exhibit manager and the
senior marketing executive in each firm. We will purposely focus on the latter
questionnaire sent to the senior marketing executive since it focused on
perceptions of trade show performance and various marketing-related variables
identified in the literature as affecting such perceptions. These marketing
executives were asked to rate the firm's trade show performance on some
eight functions documented in the literature (see Haas 1982; Bonoma 1983;
Hutt and Speh 1985; Dunn and Barban 1986):

1. Identifying New Prospects 

2. Servicing Current Customers

3. Introducing New Products

4. Selling at the Trade Show

5. Enhancing Corporate Image

6. Testing of New Products

7. Enhancing Corporate Morale 

8. Gathering Competitive Information 

Overall trade show performance was rated also. Each of these performance
aspects was rated on a 7-point Likert type scale (1=very poor; 7=very good)
which we shall treat as metric scales (cf. Guilford 1954, pp. 15-16; and Green
and Tull 1978). In addition, data on a number of individual difference items
were collected (we will describe these later). Kerin and Cron (1987) had
performed a factor analysis on the eight performance functions listed above and
uncovered two dimensions accounting for 59.1% of the variance, roughly
corresponding to the selling and non-selling roles of trade shows conceptually
identified by Bonoma (1983). Our investigation will examine a multiattribute
analysis of the performance function data where we shall use overall trade






TABLE 7

Total Sample Regression Results on Trade Show Performance Data

INTERCEPT      3.03
X1             0.15***
X2            -0.02
X3             0.09
X4            -0.04
X5             [illegible]
X6             0.18***
X7             0.07
X8             0.04
S.E.           0.85
R²             0.37
adj R²         0.33
S.S.E.        87.67
F              8.87***
I            129

**  p ≤ .05
*** p ≤ .01



show performance as the dependent variable and the data on the eight
performance functions listed above as the independent variables. Our goal is to
examine whether groups of firms evaluate overall trade show performance
differently in terms of these eight aspects, and if so, estimate their different
regression coefficients and group membership probabilities via the new
clusterwise linear regression methodology discussed.

5.2. Preliminary Analyses

We will analyze the data for overall performance (y) and the eight
performance functions (X1, ..., X8) for these 129 firms. Treating all 129
executives as members of one large cluster or group, Table 7 presents the
resulting regression analysis of regressing overall performance on the eight
performance functions. As can be seen, identifying new prospects (X1) and new
product testing (X6) appear to be most significantly related to evaluations of
overall trade show performance. Thus, for the entire sample, it appears that
these two selling-related aspects dominate the analysis.
The issue remaining is whether there exist distinct groups of firms which
exhibit different regression coefficients.

In order to address this research issue of group regression coefficients,
we initially applied Späth's (1982, 1985) clusterwise linear regression
procedure. We ran 20 trials for each solution from K = 2 to 5 and utilized






Späth's minimum objective function rule to select both the number of clusters
and the particular solution. The K = 2 cluster solution was selected using
Späth's minimum objective function rule. Table 8 contains the best K = 2
cluster solution obtained from these analyses. This table presents multiple
regression analyses and corresponding asymptotic significance tests for each
of the two derived groups. The coefficients are identical to those obtained
from Späth's procedure, but significance tests are missing from Späth's
procedure since it is deterministic. While it is not good practice to consider
these significance tests appropriate (since the data were initially utilized to
form the groups), we merely present them as "heuristic figures of merit" in
order to gain some insight into the structure of the data as derived from this
alternative methodology. The first cluster of some 72 executives appears to
derive their overall performance evaluation primarily from non-selling
functions such as enhancing corporate image and morale (X5 and X7) and new
product introduction and testing (X3 and X6). Note the significant negative
coefficient on selling at trade shows (X4). This is a cluster of marketing
executives who appear to stress particular non-selling and new product aspects
of their trade shows. The second cluster, however, is not as clearly
interpretable. Here, identifying prospects (X1) and new product testing (X6)
are the most significant functions impacting on overall trade show performance
evaluation, although these relationships are not as strong as those reported in
the previous cluster. As such, this second cluster of 57 marketing executives
appears to resemble the total group structure as reported in Table 7.

5.3. Conditional Mixture Maximum Likelihood Procedure Results

Our conditional normal mixture maximum likelihood methodology was
applied to these data for K = 1 to 4 clusters. Table 9 presents the number of
iterations required for convergence, ln L, and AIC statistics for each solution.
According to the minimum AIC rule, the K = 2 cluster solution appears to be
the best one and will thus be reported here. Table 10 presents a summary of
the various parameter values and statistics for this two cluster solution.
Cluster one, composed of 59 marketing executives, evaluates trade shows
primarily in terms of evaluations on non-selling dimensions including servicing
current customers (X2) and enhancing corporate image and morale (X5 and X7).
Note the significant negative coefficients for introducing new products (X3)
and selling at trade shows (X4), which also substantiate this non-selling
orientation. The second cluster of 70 marketing executives appears quite
different from this first cluster. Here, identifying new prospects (X1),
introducing new products (X3), selling at trade shows (X4), and new product
testing (X6) are highly significant. The significant negative coefficient on
servicing current customers (X2) helps substantiate this "selling" orientation.






TABLE 8

Späth's Clusterwise Linear Regression Two Cluster Solution -
Multiple Regression Analyses

              CLUSTER 1      CLUSTER 2
INTERCEPT      2.27           3.72
X1             0.09           0.18*
X2            -0.00          -0.02
X3             0.15**        -0.06
X4             [illegible]    0.04
X5             0.21***        0.04
X6             0.27***        0.15**
X7             0.09*          0.00
X8             0.04           0.11
S.E.           0.73           0.94
R²             0.56           0.31
adj R²         0.51           0.20
S.S.E.        33.91          41.99
F             10.07***        2.75**
I             72             57

*   p < .10
**  p < .05
*** p ≤ .01



TABLE 9

Conditional Mixture Maximum Likelihood Procedure Results for K = 1-4 Clusters

K    Number of Iterations          ln L       AIC
     Required for Convergence
1             2                   -158.1     336.3
2            32                   -141.6     325.2*
3            62                   -132.7     329.4
4            46                   -130.9     347.8

* Minimum AIC Solution K = 2
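The AIC column of Table 9 can be checked approximately from the ln L column. The sketch below assumes the common form AIC = -2 ln L + 2 N(K), and assumes that N(K) counts K sets of nine regression coefficients (intercept plus eight performance functions), K variance terms, and K - 1 free mixing proportions. That parameter count is an assumption made here for illustration; it happens to reproduce the tabled values to within rounding.

```python
def n_params(K, J):
    """Assumed free-parameter count for a K-cluster solution with J predictors."""
    # K*(J+1) regression coefficients (incl. intercepts), K variances,
    # and K-1 free mixing proportions.
    return K * (J + 1) + K + (K - 1)


def aic(log_lik, K, J):
    """AIC = -2 ln L + 2 * (number of free parameters)."""
    return -2.0 * log_lik + 2.0 * n_params(K, J)


# (ln L, reported AIC) for K = 1..4, as listed in Table 9.
table9 = {1: (-158.1, 336.3), 2: (-141.6, 325.2),
          3: (-132.7, 329.4), 4: (-130.9, 347.8)}

computed = {K: aic(lnL, K, J=8) for K, (lnL, _) in table9.items()}
best_K = min(computed, key=computed.get)
for K in sorted(computed):
    print(K, round(computed[K], 1), table9[K][1])
print("minimum AIC at K =", best_K)
```

Small discrepancies in the last digit are expected because ln L is tabled to only one decimal place.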






This solution appears to be more congruent with previous literature (cf.
Bonoma 1983; Haas 1982; Hutt and Speh 1985; Dunn and Barban 1986) and
the empirical results reported in Kerin and Cron (1987) than does the Späth
two-cluster solution. In addition, the effects are stronger here than in Späth's
solution, producing higher adjusted R²'s when placed in a deterministic
regression context. In fact, this conditional normal mixture maximum
likelihood solution obtains a lower Späth objective function (expression 8) value
than the Späth K = 2 solution! Table 11 presents a cross classification of
membership for the Späth and conditional mixture E-M based procedures. As
shown, only 68 of the 129 executives are classified similarly. The resulting
phi coefficient calculated from this table is only 0.065, indicating little
association between the two classifications. The Späth solution produced a log
likelihood value of -156.99 when substituted in the conditional mixture E-M
based procedure as compared to -141.58 for the solution reported earlier in
Table 10. Using the Späth solution as an initial starting solution for the
conditional mixture based E-M procedure produced (after 11 iterations) a
solution with a log likelihood value of -142.53, whose b_jk values had
correlations with those in Table 10 of .996 and .983, and whose λ and σ values
differed by .01 and .02 respectively. At any rate, it is interesting to see how
running a total group analysis such as reported in Table 7 can mask the true
structure in a set of data.
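The agreement count and phi coefficient quoted above follow directly from the four cell counts of the cross classification (35, 24, 37, 33). A minimal sketch of that arithmetic, with the function name `phi_coefficient` chosen here for illustration:

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient for the 2x2 table [[a, b], [c, d]]."""
    numerator = a * d - b * c
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return numerator / denominator


# Rows: conditional mixture clusters 1 and 2; columns: Spath clusters 1 and 2.
a, b, c, d = 35, 24, 37, 33
agree = a + d                      # executives placed on the main diagonal
phi = phi_coefficient(a, b, c, d)
print(agree, round(phi, 3))
```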

Having identified two schemes used by marketing managers to evaluate
their overall trade show performance, the usefulness of our classification was
evaluated by attempting to describe the factors distinguishing between the
two groups. Much of what has been written concerning trade show
management is descriptive of the experiences of managers involved in aspects of
trade show management (e.g., Cavanaugh 1976; Hatch 1981; Konikow 1983;
Rich 1985). A number of studies that are descriptive of trade show
management have been supported by the National Trade Show Bureau (Mee 1983a,
1983b, 1984). Perhaps the first effort to systematically analyze trade show
management was Lilien's (1983) research on trade show budgeting and
participation. This study identified factors related to how much an individual
firm spent on trade shows and in which shows the firm participated. The research
by Kerin and Cron (1987) on the determinants of high trade show
performance evaluations also provides a good framework for identifying factors
related to trade show performance.

Based on this review of the literature and interviews with marketing
and exhibit managers, a list of factors was derived which are potentially
related to whether marketing managers evaluate their trade show performance
primarily on selling or non-selling dimensions. A complete list of the
individual difference items collected in the Kerin and Cron (1987) study along
with a description of their measurement is provided in Table 12. The variables






TABLE 10

Conditional Mixture Maximum Likelihood K=2 Cluster Solution

              CLUSTER 1     CLUSTER 2
INTERCEPT      4.093***      2.218***
X1             0.126         0.242***
X2             0.287***     -0.164***
X3            -0.157**       0.206**
X4            -0.133***      0.074**
X5             0.128*        0.072
X6             0.107         0.282***
X7             0.155**      -0.026
X8            -0.124         0.023
R²             0.73          0.76
adj R²         0.69          0.73
S.S.E.        20.37         12.98
I             59            70
λ_k            0.489         0.511
σ_k            0.589         0.504

**  p ≤ .05
*** p ≤ .01



TABLE 11

Membership Comparisons for Trade Show Data Analyses

                                     Späth's Cluster
                                     1       2     Totals
Conditional Mixture            1    35      24       59
E-M Procedure Cluster          2    37      33       70

Totals                              72      57      129






TABLE 12

Independent Variables for Profiling Evaluation Groups

Variable                          Description

INDUSTRY INFLUENCES:
Stage of industry life cycle      Five point scale: introduction, growth,
                                  early maturity, maturity, and decline.
Degree of product customization   Percent of sales in customized products.
Major industry group              Percent of sales in each of the following:
                                  raw materials, component parts, major
                                  capital equipment, operating supplies,
                                  consumer durables, consumer nondurables,
                                  and services.

COMPANY INFLUENCES:
Annual sales volume               Dollar figure
Number of direct customers        Number
Sales concentration               Percent of sales to top ten customers
Technical complexity              Five point Likert scale (1 = technically
                                  simple to 5 = technically complex).
Trade show budgeting              Percent of sales promotion budget spent on
                                  trade shows
Importance to top management      Five point Likert scale (1 = Not important
                                  to 5 = Very important)
Sales growth                      Last year's percent

TRADE SHOW INFLUENCES:
Written objectives (a)            Existence of formal written objectives for
                                  overall trade show effort.

MARKETING MANAGER INFLUENCES:
Length of time in position        Years in present position
Involvement in show decisions     A summative index including involvement
                                  in budgeting, policies, evaluation, setting
                                  objectives, participation, and working with
                                  exhibit manager on a five point Likert
                                  scale (1 = minimally involved to 5 =
                                  extensively involved). (b)

(a) Exhibit manager is the key informant for this variable, while the marketing
manager provided information on the remaining variables.

(b) The Cronbach alpha for this measure was .87.



are organized into a framework similar to that used by Kerin and Cron (1987),
which consists of (a) industry influences, (b) company influences, and (c)
trade show strategy influences. In addition, a fourth set of influencing factors
was considered in this study and is referred to as the marketing manager's
influences. This was considered to be appropriate because the marketing
manager's historical and current involvement with trade shows may influence
his/her performance evaluations.






Given the posterior probabilities of membership in the two derived
clusters, i.e., the p_ik's, a logit transformation was performed on the
probability that a marketing manager evaluated trade show performance primarily
on a selling dimension. Specifically, the log was taken of the ratio of the
selling cluster membership probability divided by one minus the selling cluster
membership probability (adjustments of adding/subtracting a small positive
constant were made for p_ik = 0 or 1). Multiple regression analysis was
performed using the 20 independent variables listed in Table 12. Because of
missing data for some of the independent variables, 102 firms were included
in this analysis. The results of this analysis are presented in Table 13.
(Stepwise multiple regression analysis was also performed for more parsimonious
results. The results were congruent with those in Table 13.) Five variables
were significant in the equation: high technology products, new product
introductions, sales concentration, importance to top management, and percent of
promotion budget. Specifically, marketing managers who are most likely to
emphasize selling results from trade show participation are in firms that
sell high tech products, make frequent new product introductions, have low
customer sales concentrations, allocate a low percent of the sales promotion
budget to trade shows, and have top management who consider trade
shows very important to the organization's success in meeting its marketing
objectives.
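The logit transformation described above can be sketched as follows. The size of the small positive constant is not stated in the text, so the value `eps=1e-3` below is purely an assumption for illustration, as is the function name.

```python
import math


def logit_with_adjustment(p, eps=1e-3):
    """Log-odds of p, with p moved inside (0, 1) by eps when it equals 0 or 1.

    eps is a hypothetical choice; the source only says a small positive
    constant was added or subtracted in these boundary cases.
    """
    p_adj = min(max(p, eps), 1.0 - eps)
    return math.log(p_adj / (1.0 - p_adj))


# Posterior probabilities of belonging to the selling cluster (made-up values).
for p in (0.0, 0.25, 0.5, 0.9, 1.0):
    print(p, round(logit_with_adjustment(p), 3))
```

The transformed values are unbounded, which makes them a more natural dependent variable for ordinary multiple regression than the raw probabilities.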

An alternative approach for evaluating the practical usefulness of these
20 independent variables is to determine how well they can predict whether a
marketing manager evaluates trade shows primarily on a selling or non-selling
basis. Marketing managers were placed in either a selling or non-selling
group based on the higher p_ik. This procedure resulted
in 48 managers being placed in the selling group and the remaining 54
managers categorized as non-selling (given the missing data). Two group
multiple stepwise discriminant analysis was used to distinguish statistically
between the two groups. The resulting discriminant function contained ten
significant variables and produced a Wilks' lambda of 0.388 with chi-square of
63.312 (p < .0001). The ten variables in order of significance are (see Table
14) product technology, frequency of new product introduction, sales
concentration, importance to top management, percent of promotion budget, selling
information processing equipment, sales growth, marketing management's
involvement, selling raw materials, and written trade show objectives. The
results are quite similar to those presented in Table 13 concerning the logit
regression analysis. These results indicate that, in comparison with marketing
managers who evaluate trade shows primarily on non-selling dimensions,
those evaluating on selling dimensions: (1) sell more highly technical
products, (2) frequently introduce new products, (3) do not concentrate their
sales to a few large firms, (4) have their top management consider trade





TABLE 13

Logit Transformed Regression Results for Selling Evaluations

Variable                                     Beta           t
High tech products                            .32          2.794**
New product introduction                      .34          2.862**
Sales concentration (%)                       [illegible]  1.958*
Importance to top management                  .23          1.920*
Percent of promotion budget                  -.22          1.928*
Sales growth                                  .02           .169
Marketing manager's involvement               .05           .441
Written show objectives                       .19          1.678
Industry life cycle                          -.08           .656
Product modification (%)                     -.12           .974
Marketing manager's experience               -.02           .177
Number of customers                          -.17          1.345
Size                                          .05           .356
Selling raw materials                         .15          1.377
Selling component parts                      -.18          1.584
Selling major capital equipment               [illegible]   .742
Selling operating supplies                    [illegible]   .106
Selling consumer durables                    -.05           .448
Selling consumer nondurables                  .07           .542
Selling information processing equipment      [illegible]  1.181

F = 2.217**

*  p < .05
** p < .01

shows to be more important for achieving the firm's marketing objectives, (5)
spend less on trade shows as a percent of total promotion budget, (6) are not
likely to be selling information processing equipment, (7) experience higher
sales growth, (8) are firms in which the marketing manager is more intimately
involved in trade show related decisions, (9) are more likely to be selling raw
materials, and (10) are more likely to have written objectives for their overall
trade show program.






TABLE 14

Discriminant Analysis of Performance Evaluation Groups

                                               Mean Values            Standardized
                                          Selling     Non-Selling     Discriminant
                                          Evaluation  Evaluation      Function
Variable                                                              Coefficient (a)
High tech products                          4.42        3.33            .7215
Frequent new product introduction           2.42        1.67            .6439
High sales concentration (%)               24.42       39.47           -.6426
Important to top management                 3.92        3.31            [illegible]
Percent of promotion budget (%)            [illegible] 20.69            [illegible]
Selling information processing
  equipment (%)                             9.21       11.94            [illegible]
Sales growth (%)                           20.95       14.47            .3309
Marketing manager's involvement            25.11       22.86            .3047
Selling of raw materials (%)               12.71        6.13            .2867
Written show objectives                     1.27        1.58            [illegible]

Canonical Correlation                   .781
Wilks' Lambda                          0.388
Chi square                            63.312
Significance                            .001
Correct classification rate (%)        81.37

(a) All variables are significant at the .01 level or higher.

The above results concerning the profile of firms that evaluate trade shows
primarily for their selling effectiveness appear reasonable
and logical. In general, selling oriented firms have a story to tell (e.g., new
and high tech products), have a wide audience to reach, have written
objectives because trade shows produce results that are quantifiable and central to
the success of the organization, display intense marketing involvement, and
have the support of top management. Discussion of these results with
industry experts indicates that the industry results are also consistent in that
manufacturers of information processing equipment, especially the larger
organizations, do not actively sell on the trade show floor, while selling is
quite common for marketers of raw materials. The most surprising result is
that selling oriented organizations spend less as a percent of total sales
promotion budget on trade show participation. This may reflect the cost
efficiency of trade shows versus traditional field sales force selling (Mee
1982).






Here, 81.3% of the organizations were correctly classified as evaluating
their trade show programs on a selling versus non-selling basis. This
percentage of correct classifications was compared to a proportional chance
criterion of .466 (Morrison 1969). Using a test of the difference between two
proportions, Z = 7.857, which is extremely significant (p ≤ .001). As a test of
the upward bias in the classification results caused by reuse of the sample
data, the Lachenbruch (1967, 1975) holdout procedure was used to classify
individual organizations. The validated classification rate was 76.84%. This
further indicates the predictor variables are important discriminators in this
application.
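The idea behind a holdout check of this kind can be sketched in a few lines: each observation is classified by a rule estimated from all the other observations, and the resulting hit rate is compared with the rate obtained when the same data are used for both estimation and classification. The sketch below is only an illustration under strong simplifications. It substitutes a one-variable nearest-group-mean rule for the discriminant function, and the eight data points are invented.

```python
def nearest_mean_label(x, train):
    """Classify x by the closer group mean; train is a list of (value, label)."""
    means = {}
    for label in {lab for _, lab in train}:
        vals = [v for v, lab in train if lab == label]
        means[label] = sum(vals) / len(vals)
    return min(means, key=lambda lab: abs(x - means[lab]))


def resubstitution_rate(data):
    """Hit rate when the rule is estimated and evaluated on the same data."""
    hits = sum(nearest_mean_label(x, data) == lab for x, lab in data)
    return hits / len(data)


def holdout_rate(data):
    """Hit rate when each case is classified by a rule fit without that case."""
    hits = 0
    for i, (x, lab) in enumerate(data):
        train = data[:i] + data[i + 1:]   # leave observation i out
        hits += nearest_mean_label(x, train) == lab
    return hits / len(data)


# Invented scores on a single predictor for two groups.
data = [(4.0, "selling"), (4.5, "selling"), (5.0, "selling"), (3.6, "selling"),
        (3.0, "non-selling"), (2.5, "non-selling"), (3.4, "non-selling"),
        (2.0, "non-selling")]
resub = resubstitution_rate(data)
held = holdout_rate(data)
print(resub, held)
```

In this toy example the two borderline cases are classified correctly only when they help estimate their own group mean, which is the kind of upward bias the holdout rate is meant to expose.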

6. Discussion

The conditional mixture maximum likelihood methodology for clusterwise
linear regression has been technically described, as well as the E-M
algorithm for estimation. A Monte Carlo analysis investigating the performance
of the methodology as a number of data and model factors were experimentally
varied was presented. Finally, an application to trade show performance
evaluations collected from senior marketing managers illustrated how two
different groups of managers utilized very different criteria to evaluate their
promotional expenditures in trade shows.
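To make the summary above concrete, the following minimal sketch shows the kind of computation an expectation step performs in a normal mixture of regressions: posterior membership probabilities and the log likelihood for fixed parameter values. It assumes the standard form in which each observation's density is a λ-weighted sum of normal densities centered on cluster-specific regression predictions. The function names, the toy data, and the parameter values are illustrative assumptions, and the maximization step (weighted least squares updates) is omitted.

```python
import math


def normal_pdf(y, mean, sd):
    """Univariate normal density."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def e_step(y, X, lambdas, betas, sigmas):
    """Return (posteriors, log_likelihood) for a K-component regression mixture.

    X rows include a leading 1 for the intercept; betas[k] holds the
    coefficients of cluster k. All argument names are hypothetical.
    """
    posteriors, log_lik = [], 0.0
    for y_i, x_i in zip(y, X):
        weighted = []
        for lam, b, sd in zip(lambdas, betas, sigmas):
            mean = sum(bj * xj for bj, xj in zip(b, x_i))
            weighted.append(lam * normal_pdf(y_i, mean, sd))
        total = sum(weighted)
        posteriors.append([w / total for w in weighted])
        log_lik += math.log(total)
    return posteriors, log_lik


# Toy data: the first point lies near the line y = x, the second near y = 10 - x.
y = [1.1, 7.9]
X = [[1.0, 1.0], [1.0, 2.0]]
post, ll = e_step(y, X, lambdas=[0.5, 0.5],
                  betas=[[0.0, 1.0], [10.0, -1.0]], sigmas=[0.5, 0.5])
print([[round(p, 3) for p in row] for row in post], round(ll, 3))
```

Evaluating the log likelihood this way for a fixed set of parameters is also how a solution from another procedure could be scored under the mixture model.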

There are clearly other potential applications for this new methodology.
For example, this clusterwise linear regression methodology could be utilized
in the general context of multiattribute models for attitude measurement. In a
similar vein, the E-M based procedure could be adapted for use in conjoint
analysis studies (Green and Rao 1971) to investigate the basis of preference
or choice. More substantive applications exist in virtually all the social
sciences. In psychological testing, for example, this methodology could be
utilized to identify groups of respondents that perform particularly poorly/well
on specific items of a test. Concerning management research, the
methodology could be utilized to relate firm strategy to resulting corporate
performance and identify "strategic groups" (Porter 1980) or clusters of firms that
utilize profiles of strategy to attain similar performance. Finally, in the area
of political science, the procedure could be used to group countries with
respect to common factors producing political risk levels (cf. Krayenbuehl
1985).

In addition, the methodology can be extended in a number of
directions. For example, this conditional mixture approach could be modified to
accommodate a binary choice or a rate dependent variable involving mixtures
of other distributions from the exponential family. Like Basford and
McLachlan (1985), the procedure can be generalized to accommodate
three-way analyses where, for example, y and X could be given for various different
time periods or over various experimental manipulations. Another area of
potential research involves modifying the procedure so user stipulated
constraints as discussed in DeSarbo and Mahajan (1984) can be incorporated (cf.
DeSarbo, Oliver, and Rangaswamy 1988). Finally, an interesting
generalization would be to accommodate the estimation of multiple (by cluster)
ultrametric or path length tree(s) from such data.



References 

ADDELMAN, S. (1962), "Orthogonal Main Effects Plans for Asymmetrical Factorial Experiments," Technometrics, 4, 21-46.

AKAIKE, H. (1974), "A New Look at Statistical Model Identification," IEEE Transactions on Automatic Control, AC-19, 716-723.

BASFORD, K.E., and MCLACHLAN, G.J. (1985), "The Mixture Method of Clustering Applied to Three-Way Data," Journal of Classification, 2, 109-125.

BINDER, D.A. (1978), "Bayesian Cluster Analysis," Biometrika, 65, 31-38.

BONOMA, T.V. (1983), "Get More Out of Your Trade Shows," Harvard Business Review, 61, 75-83.

BOZDOGAN, H. (1983), "Determining the Number of Component Clusters in Standard Multivariate Normal Mixture Models Using Model-Selection Criteria," Technical Report UIC/DQM/A83-1, Army Research Office.

CAVANAUGH, S. (1976), "Setting Objectives and Evaluating the Effectiveness of Trade Show Exhibits," Journal of Marketing, 40, 100-103.

CHARLIER, C.V.L., and WICKSELL, S.D. (1924), "On the Dissection of Frequency Functions," Arkiv för Matematik, Astronomi och Fysik, Bd. 18, No. 6, 85-98.

CLEAVER, J. (1982), "You Don't Have to be a Star in this Show," Advertising Age, 53, 9.

COHEN, A.C. (1967), "Estimation in Mixtures of Two Normal Distributions," Technometrics, 9, 15-28.

COOPER, P.W. (1964), "Non Supervised Adaptive Signal Detection and Pattern Recognition," Information and Control, 7, 416-444.

DAY, N.E. (1969), "Estimating the Components of a Mixture of Normal Distributions," Biometrika, 56, 463-474.

DEMPSTER, A.P., LAIRD, N.M., and RUBIN, D.B. (1977), "Maximum Likelihood from Incomplete Data Via the E-M Algorithm," Journal of the Royal Statistical Society, B39, 1-38.

DESARBO, W.S. (1982), "GENNCLUS: New Models for General Nonhierarchical Clustering Analysis," Psychometrika, 47, 449-476.

DESARBO, W.S., and CARROLL, J.D. (1985), "Three Way Metric Unfolding Via Alternating Weighted Least Squares," Psychometrika, 50, 275-300.

DESARBO, W.S., CARROLL, J.D., CLARK, L.A., and GREEN, P.E. (1984), "Synthesized Clustering: A Method for Amalgamating Alternative Clustering Bases with Differential Weighting of Variables," Psychometrika, 49, 57-78.

DESARBO, W.S., and MAHAJAN, V. (1984), "Constrained Classification: The Use of a Priori Information in Cluster Analysis," Psychometrika, 49, 187-215.

DESARBO, W.S., OLIVER, R., and RANGASWAMY, A. (1988), "A Simulated Annealing Methodology for Clusterwise Linear Regression," Working Paper, University of Michigan, Ann Arbor, MI.



280 W. S. DeSaitx) and W. U Cron 



DE SOETE, a. DESARBO. W.S., FURNAS, and CARROLU JD. (1984), *The 

Represetttati<m of Nonsyi^ Rectangular Proximity Data by Ultrametric and PaAi 

Length Tree Stcuctmc,'rPsychometrika, 49, 289-310. 
De SOETE, O., DESARBO, W.S., and CARROLL, JD. (1985X ••Qptinial Vaiiable Weighting 

for Hierarducal Clustering: An Alteniatmg Least Squares Algorithm,** Journal igf 

Classificmion, 2, 173-192, 
bUNN,S.W., and BARB AN, AJM. (X986). Advertising, 6th ed. Hinsdale, XL: Dryden. 
DYNKIN, E.B. (1961), "Necessary and Sufficient Statistics for a Family of Probability Distributions," Selected Translations in Mathematical Statistics and Probability, Providence, RI: American Mathematical Society.
EVERITT, B.S., and HAND, D.J. (1981), Finite Mixture Distributions, New York: Chapman and Hall.

GANESALINGAM, S., and MCLACHLAN, G.J. (1981), "Some Efficiency Results for the Estimation of the Mixing Proportion in a Mixture of Two Normal Distributions," Biometrics, 37, 23-33.

GREEN, P.E., and RAO, V.R. (1971), "Conjoint Measurement for Quantifying Judgmental Data," Journal of Marketing Research, 8, 355-363.
GREEN, P.E., and TULL, D.S. (1978), Research for Marketing Decisions, 4th ed., Englewood Cliffs, NJ: Prentice-Hall.
GUILFORD, J.P. (1954), Psychometric Methods, 2nd ed., New York: McGraw-Hill.
HAAS, R.W. (1982), Industrial Marketing Management, 2nd ed., Boston: Kent.
HARTIGAN, J.A. (1975), Clustering Algorithms, New York: Wiley.

HARTIGAN, J.A. (1977), "Distribution Problems in Clustering," in Classification and Clustering, ed. J. Van Ryzin, New York: Academic Press, 45-71.
HASSELBLAD, V. (1966), "Estimation of Parameters for a Mixture of Normal Distributions," Technometrics, 8, 431-444.
HATCH, M. (1981), How to Improve Sales Success at Trade Shows, New Canaan, CT: Trade Show Bureau.

HOSMER, D.W. (1974), "Maximum Likelihood Estimates of the Parameters of a Mixture of Two Regression Lines," Communications in Statistics, 3, 995-1006.

HUTT, M.D., and SPEH, T.W. (1985), Industrial Marketing Management, 2nd ed., Hinsdale, IL: Dryden.

JOHNSTON, J. (1984), Econometric Methods, 3rd ed., New York: McGraw-Hill.

JUDGE, G.G., GRIFFITHS, W.E., HILL, R.C., LUTKEPOHL, H., and LEE, T.C. (1985), The Theory and Practice of Econometrics, New York: Wiley.
KERIN, R.A., and CRON, W.L. (1987), "Assessing Trade Show Functions and Performance: An Exploratory Study," Journal of Marketing, 51, 87-94.
KONIKOW, R.B. (1983), How to Participate Profitably in Trade Shows, rev. ed., Chicago: Dartnell.

KRAYENBUEHL, T.E. (1985), Country Risk: Assessment and Monitoring, Lexington, MA: Lexington.

LACHENBRUCH, P.A. (1967), "An Almost Unbiased Method of Obtaining Confidence Intervals for the Probability of Misclassification in Discriminant Analysis," Biometrics, 23, 639-645.

LACHENBRUCH, P.A. (1975), Discriminant Analysis, New York: Hafner Press.

LILIEN, G.L. (1983), "A Descriptive Model of the Trade Show Budgeting Decision Process," Industrial Marketing Management, 12, 25-29.
LOUIS, T.A. (1982), "Finding the Observed Information Matrix When Using the E-M Algorithm," Journal of the Royal Statistical Society, B44, 226-233.



MACQUEEN, J. (1967), "Some Methods for Classification and Analysis of Multivariate Observations," in the 5th Berkeley Symposium of Mathematics, Statistics and Probability, Vol. 1, eds. L.M. LeCam and J. Neyman, Los Angeles, CA: University of California Press, 281-298.

MADDALA, G.S. (1976), Econometrics, New York: McGraw-Hill.

MARRIOTT, F.H.C. (1975), "Separating Mixtures of Normal Distributions," Biometrics, 31, 767-769.

MCLACHLAN, G.J. (1982), "The Classification and Mixture Maximum Likelihood Approaches to Cluster Analysis," in Handbook of Statistics, Vol. 2, eds. P.R. Krishnaiah and L.N. Kanal, Amsterdam: North-Holland, 199-208.
MCLACHLAN, G.J. (1987), "On Bootstrapping the Likelihood Ratio Test Statistic for the Number of Components in a Normal Mixture," Applied Statistics, 36, 318-324.
MCLACHLAN, G.J., and BASFORD, K.E. (1988), Mixture Models: Inference and Applications to Clustering, New York: Marcel Dekker.
MEE, W.W. (1982), Trade Show Exhibit Cost Analysis, East Orleans, MA: Trade Show Bureau.

MEE, W.W. (1983a), Trade Show Industry Growth 1972-1981, Projected Growth 1981-1991, East Orleans, MA: Trade Show Bureau.
MEE, W.W. (1983b), The Exhibitors: Their Trade Show Practices, East Orleans, MA: Trade Show Bureau.

MEE, W.W. (1984), Audience Characteristics - Regional and National Trade Shows, East Orleans, MA: Trade Show Bureau.
MORRISON, D.G. (1969), "On the Interpretation of Discriminant Analysis," Journal of Marketing Research, 6, 156-163.
PEARSON, K. (1894), "Contribution to the Mathematical Theory of Evolution," Philosophical Transactions of the Royal Society of London, A, 185, 71-110.
PETERS, B.C., and WALKER, H.F. (1978), "An Iterative Procedure for Obtaining Maximum Likelihood Estimates of the Parameters for a Mixture of Normal Distributions," SIAM Journal on Applied Mathematics, 35, 362-378.

PORTER, M.E. (1980), Competitive Strategy, New York: Free Press.

QUANDT, R.E. (1972), "A New Approach to Estimating Switching Regressions," Journal of the American Statistical Association, 67, 306-310.
QUANDT, R.E., and RAMSEY, J.B. (1978), "Estimating Mixtures of Normal Distributions and Switching Regressions," Journal of the American Statistical Association, 73, 730-738.

REDNER, R.A., and WALKER, H.F. (1984), "Mixture Densities, Maximum Likelihood, and the E-M Algorithm," SIAM Review, 26, 195-239.
RICH, M. (1985), "Regional Shows Give Small Marketers an Even Break," Marketing News, May 27, 19, p. 15.

SCLOVE, S.L. (1977), "Population Mixture Models and Clustering Algorithms," Communications in Statistics, A6, 417-434.

SCLOVE, S.L. (1983), "Application of the Conditional Population Mixture Model to Image Segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, 5, 428-433.

SCLOVE, S.L. (1987), "Application of Model-Selection Criteria to Some Problems in Multivariate Analysis," Psychometrika, 52, 333-343.

SCOTT, A.J., and SYMONS, M.J. (1971), "Clustering Methods Based on Likelihood Ratio Criteria," Biometrics, 27, 387-397.

SPATH, H. (1979), "Algorithm 39: Clusterwise Linear Regression," Computing, 22, 367-373.



SPATH, H. (1981), "Correction to Algorithm 39: Clusterwise Linear Regression," Computing, 26, 275.

SPATH, H. (1982), "Algorithm 48: A Fast Algorithm for Clusterwise Linear Regression," Computing, 29, 175-181.
SPATH, H. (1985), Cluster Dissection and Analysis.

SYMONS, M.J. (1981), "Clustering Criteria and Multivariate Normal Mixtures," Biometrics, 37, 35-43.

TEICHER, H. (1961), "Identifiability of Mixtures," Annals of Mathematical Statistics, 32, 244-248.

TEICHER, H. (1963), "Identifiability of Finite Mixtures," Annals of Mathematical Statistics, 34, 1265-1269.

THEIL, H. (1971), Principles of Econometrics, New York: Wiley.

TITTERINGTON, D.M., SMITH, A.F.M., and MAKOV, U.E. (1985), Statistical Analysis of Finite Mixture Distributions, New York: Wiley.
VEAUX, R.D. (1986), "Parameter Estimation for a Mixture of Linear Regressions," Technical Report No. 247, Department of Statistics, Stanford University, Stanford, CA.
WILSON, D.L., and SARGENT, R.G. (1979), "Some Results of Monte Carlo Experiments in Estimating the Parameters of the Finite Mixed Exponential Distribution," in the Proceedings of the Twelfth Annual Symposium on the Interface, ed. J.F. Gentleman, University of Waterloo, Ontario.

WOLFE, J.H. (1965), "A Computer Program for the Maximum Likelihood Analysis of Types," Technical Bulletin 65-15, U.S. Naval Personnel Research Activity, San Diego, CA.
WOLFE, J.H. (1967), "NORMIX: Computational Methods for Estimating the Parameters of Multivariate Normal Mixtures of Distributions," Research Memorandum SRM 68-2, U.S. Naval Personnel Research Activity, San Diego, CA.
WOLFE, J.H. (1970), "Pattern Clustering by Multivariate Mixture Analysis," Multivariate Behavioral Research, 5, 329-350.
WOLFE, J.H. (1971), "A Monte Carlo Study of the Sampling Distribution of the Likelihood Ratio for Mixtures of Multinormal Distributions," Technical Bulletin STB 72-2, Naval Personnel and Training Research Laboratory, San Diego, CA.

YAKOWITZ, S.J. (1970), "Unsupervised Learning and the Identification of Finite Mixtures," IEEE Transactions on Information Theory, IT-16, 330-338.
YAKOWITZ, S.J., and SPRAGINS, J.D. (1968), "On the Identifiability of Finite Mixtures," Annals of Mathematical Statistics, 39, 209-214.



